In the Specification:

Change paragraph 0002 as follows:

A technique of time-varying SNR dependent coding for increased communication
channel robustness is described by A. Bernard, one of the inventors herein, and A. Alwan
in "Joint Channel Decoding - Viterbi Recognition for Wireless Applications," in
Proceedings of Eurospeech, Sept. 2001, vol. 4, pp. 2703-6; A. Bernard, X. Liu, R. Wesel
and A. Alwan in "Speech Transmission Using Rate-Compatible Trellis Codes and
Embedded Source Coding," IEEE Transactions on Communications, vol. 50, no. 2, pp.
309-320, Feb. 2002; A. Bernard and A. Alwan, "Source and Channel Coding for Low Bit
Rate Distributed Speech Recognition Systems," IEEE Transactions on Speech and Audio
Processing, vol. 10, no. 8, pp. 570-580, Nov. 2002; and A. Bernard in "Source
and Channel Coding for Speech and Remote Speech Recognition," Ph.D. thesis,
University of California, Los Angeles, 2002.
Change paragraph 0011 as follows:

In general, there are two related approaches to solve the temporal alignment problem with
HMM speech recognition. The first is the application of dynamic programming or
Viterbi decoding, and the second is the more general forward-backward algorithm. The
Viterbi algorithm (essentially the same algorithm as the forward probability calculation
except that the summation is replaced by a maximum operation) is typically used for
segmentation and recognition and the forward-backward for training. See for the Viterbi
algorithm G.D. Forney, "The Viterbi algorithm," IEEE Transactions on
Communications, vol. 61, no. 3, pp. 268-278, April 1973.
Change paragraph 0012 as follows: 



TI-37332

The Viterbi algorithm finds the state sequence $Q$ that maximizes the probability $P^*$ of
observing the features sequence ($O = o_1, \ldots, o_T$) given the acoustic model $\lambda$:

$$P^* = \max_{\text{all } Q} P(Q, O \mid \lambda). \qquad (1)$$

Change paragraph 0014 as follows:

The maximum likelihood $P^*(O \mid \lambda)$ is then given by $P^*(O \mid \lambda) = \max_j \{\varphi_j(T)\}$.
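The maximization over state sequences and the final step $\max_j \{\varphi_j(T)\}$ can be illustrated with a minimal log-domain sketch. The function name `viterbi` and the argument layout (`log_pi`, `log_a`, `log_b`) are assumptions made for this illustration and do not come from the specification.

```python
import math

def viterbi(log_pi, log_a, log_b):
    """Return (best log-likelihood, best state path).

    log_pi[j]   : log initial probability of state j
    log_a[i][j] : log transition probability from state i to state j
    log_b[t][j] : log probability of observing frame t in state j
    """
    n_states = len(log_pi)
    n_frames = len(log_b)
    # phi[j] holds the best log score of any path ending in state j.
    phi = [log_pi[j] + log_b[0][j] for j in range(n_states)]
    back = []
    for t in range(1, n_frames):
        new_phi, ptr = [], []
        for j in range(n_states):
            # Maximum instead of the summation used by the forward pass.
            best_i = max(range(n_states), key=lambda i: phi[i] + log_a[i][j])
            ptr.append(best_i)
            new_phi.append(phi[best_i] + log_a[best_i][j] + log_b[t][j])
        phi = new_phi
        back.append(ptr)
    # P* = max_j phi_j(T); follow the back-pointers to recover Q.
    last = max(range(n_states), key=lambda j: phi[j])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return phi[last], path
```

Working in the log domain avoids numerical underflow on long utterances; the products in equation 1 become sums.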
Change paragraph 0019 as follows: 

Under the hypothesis of a diagonal covariance matrix $\Sigma$, the overall probability $b_j(o_t)$
can be computed as the product of the probabilities of observing each individual feature.
The weighted recursive formula (equation 3) can include individual weighting factors
$\gamma_{k,t}$ for each of the $N_F$ front-end features.

$$\varphi_{j,t} = \max_i \left[\varphi_{i,t-1}\, a_{ij}\right] \prod_{k=1}^{N_F} \left[b_j(o_{k,t})\right]^{\gamma_{k,t}} \qquad (4)$$

where $k$ indicates the dimension index of the feature observed.

Change paragraph 0022 as follows: 

In order to perform time and frequency SNR dependent weighting, we need to change the
way the probability $b_j(o_t)$ is computed. Normally, the probability of observing the
$N_F$-dimensional feature vector $o_t$ in the state $j$ is computed as follows,

$$b_j(o_t) = \sum_{m=1}^{N_M} w_m \frac{1}{\sqrt{(2\pi)^{N_F} |\Sigma|}}\, e^{-\frac{1}{2}(o_t - \mu)' \Sigma^{-1} (o_t - \mu)} \qquad (5)$$

where $N_M$ is the number of mixture components, $w_m$ is the mixture weight, and the
parameters of the multivariate Gaussian mixture are its mean vector $\mu$ and covariance
matrix $\Sigma$.
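A Gaussian mixture observation probability of the kind described above can be sketched as follows for the diagonal-covariance case. The function name `log_gmm_diag` is hypothetical, and giving each mixture component its own mean and variance vector is an assumption of this sketch.

```python
import math

def log_gmm_diag(o, weights, means, variances):
    """Log observation probability for a diagonal-covariance Gaussian mixture.

    o            : feature vector (list of floats)
    weights[m]   : mixture weight of component m
    means[m]     : mean vector of component m
    variances[m] : diagonal of the covariance matrix of component m
    """
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        log_norm = -0.5 * sum(math.log(2.0 * math.pi * v) for v in var)
        dist = sum((x - m) ** 2 / v for x, m, v in zip(o, mu, var))
        log_terms.append(math.log(w) + log_norm - 0.5 * dist)
    # Log-sum-exp over mixture components for numerical stability.
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))
```

The quantity `dist` is the variance-weighted squared distance between the observed feature and the mean, which is the term the SNR dependent weighting modifies.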
Change paragraph 0023 as follows:

In order to simplify notation, we note that $\log(b_j(o_t))$ is proportional to a
weighted sum of the cepstral distance between the observed feature and the cepstral mean
$(o_t - \mu)$, where the weighting coefficients are based on the inverse covariance matrix ($\Sigma^{-1}$),

$$\log(b_j(o_t)) \propto (o_t - \mu)' \Sigma^{-1} (o_t - \mu). \qquad (6)$$
Change paragraph 0024 as follows: 

Remember that the $N_F$-dimensional cepstral feature $o_t$ is obtained by performing the
Discrete Cosine Transform (DCT) on the $N_S$-dimensional log Mel spectrum ($S$).
Mathematically, if the $N_F \times N_S$ dimensional matrix $M$ represents the DCT
transformation matrix, then we have $o_t = MS$. Reciprocally, we have $S = M^{-1} o_t$, where $M^{-1}$
($N_S \times N_F$) represents the matrix for the inverse DCT operation.
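The round trip between log Mel spectrum and cepstrum can be illustrated with a small sketch. It assumes a square, orthonormal DCT-II matrix (the same number of cepstral and spectral dimensions), so that the inverse is simply the transpose; the non-square case described above would need a different inverse. All names here (`dct_matrix`, `matvec`, the toy vector `S`) are illustrative only.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (square case, assumed here)."""
    m = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        m.append([scale * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                  for i in range(n)])
    return m

def matvec(a, x):
    """Multiply matrix a (list of rows) by vector x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

M = dct_matrix(4)
M_inv = transpose(M)          # valid only because this M is orthonormal
S = [1.0, 2.0, 0.5, -1.0]     # toy log Mel spectrum
o = matvec(M, S)              # cepstral feature: o = M S
S_back = matvec(M_inv, o)     # recovered spectrum: S = M^{-1} o
```

Because the transform is invertible in this square case, any weighting applied in the spectral domain can be carried back to the cepstral domain without loss.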
Change paragraph 0027 to correct the symbol on either side of S as follows: 
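With this notation, the weighted probability of observing the feature becomes

$$b_j(o_t) = \sum_{m=1}^{N_M} w_m \frac{1}{\sqrt{(2\pi)^{N_F} |\Sigma|}}\, e^{-\frac{1}{2}\left[M G_t M^{-1} (o_t - \mu)\right]' \Sigma^{-1} \left[M G_t M^{-1} (o_t - \mu)\right]} \qquad (9)$$

which can be rewritten using a back-and-forth weighted time-varying transformation
matrix $T_t = M G_t M^{-1}$ as

$$b_j(o_t) = \sum_{m=1}^{N_M} w_m \frac{1}{\sqrt{(2\pi)^{N_F} |\Sigma|}}\, e^{-\frac{1}{2}\left[T_t (o_t - \mu)\right]' \Sigma^{-1} \left[T_t (o_t - \mu)\right]} \qquad (10)$$

which can also resemble the unweighted equation 5 with a new inverse covariance
matrix

$$b_j(o_t) = \sum_{m=1}^{N_M} w_m \frac{1}{\sqrt{(2\pi)^{N_F} |\Sigma|}}\, e^{-\frac{1}{2}(o_t - \mu)' \left(T_t' \Sigma^{-1} T_t\right) (o_t - \mu)} \qquad (11)$$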
Change paragraph 0030 as follows: 

In that specific case, the time and frequency SNR evaluation we are using for the purpose
of evaluating the presented technique is that of the ETSI Distributed Speech Recognition
standard, which evaluates the SNR in the time and frequency domain for spectral
subtraction purposes. See ETSI STQ-Aurora DSR Working Group, "Extended Advanced
Front-End (xafe) Algorithm Description," Tech. Rep., ETSI, March 2003.
Change paragraph 0032 as follows: 

One particular instantiation of equation 12 is using a Wiener filter type equation applied
on the linear SNR estimate to obtain

$$\gamma_{t,f} = \frac{\eta_{t,f}}{1 + \eta_{t,f}}$$

which guarantees that $\gamma_{t,f}$ is equal to 0 when $\eta_{t,f} = 0$ and $\gamma_{t,f}$ approaches 1 when $\eta_{t,f}$ is
large.
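A weighting with the two stated properties (0 at zero SNR, approaching 1 at high SNR) can be sketched as below. The Wiener-type form SNR / (1 + SNR) is assumed here as one common choice that satisfies both properties, and the function name `wiener_weight` is hypothetical.

```python
def wiener_weight(snr_linear):
    """Map a non-negative linear SNR estimate to a weight in [0, 1)."""
    if snr_linear < 0:
        raise ValueError("linear SNR estimate must be non-negative")
    # Assumed Wiener-type gain: 0 at zero SNR, tends to 1 as SNR grows.
    return snr_linear / (1.0 + snr_linear)
```

For example, a linear SNR of 1 (0 dB) gives a weight of 0.5, so a band where speech and noise have equal power contributes half as much as a clean band.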

Change paragraph 0033 as follows: 

Figure 2 illustrates the block diagram for the time and frequency weighted Viterbi
recognition algorithm. When you have speech (speech frame $t$) the first step 21 is to
estimate the SNR to get $\eta_{t,f}$. Then the weighting is calculated to get $\gamma_{t,f}$ at step 23. Then
the transform matrix computation at step 25 is performed. This is the $M G_t M^{-1}$ to get
$T_t$. The next step is Viterbi decoding at step 27 to get $b_j(o_t)$. Here the original MFCC
feature $o_t$ is sent to the recognizer. The original feature contains the information about the
SNR.
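Steps 23 and 25 of the block diagram can be sketched for a single frame as follows. This is an illustration under several assumptions that are not stated in this section: $G_t$ is taken to be the diagonal matrix of the per-band weights, the weights use the Wiener-type form SNR / (1 + SNR), and the DCT matrix is square and orthonormal so that its inverse is its transpose. The names `dct_matrix`, `matmul`, and `transform_matrix` are hypothetical.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (square case, assumed here)."""
    m = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        m.append([scale * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                  for i in range(n)])
    return m

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transform_matrix(snr_linear):
    """Build the frame's transform from per-band linear SNR estimates."""
    n = len(snr_linear)
    # Step 23: per-band weights from the SNR estimates (assumed form).
    gamma = [eta / (1.0 + eta) for eta in snr_linear]
    # Assumed: G is diagonal with the weights on its diagonal.
    G = [[gamma[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    M = dct_matrix(n)
    M_inv = [list(col) for col in zip(*M)]   # transpose of an orthonormal M
    # Step 25: go to the spectral domain, weight each band, and come back.
    return matmul(matmul(M, G), M_inv)
```

When every band has the same SNR the result is a scaled identity matrix, so the weighting only changes the shape of the distance measure when the SNR varies across frequency.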